{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [使用Numpy计算：通用函数](02.03-Computation-on-arrays-ufuncs.ipynb) | [目录](Index.ipynb) | [在数组上计算：广播](02.05-Computation-on-arrays-broadcasting.ipynb) >\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/wangyingsm/Python-Data-Science-Handbook/blob/master/notebooks/02.04-Computation-on-arrays-aggregates.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open and Execute in Google Colaboratory\"></a>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Aggregations: Min, Max, and Everything In Between\n",
    "\n",
    "# 聚合：Min，Max和其他"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Often when faced with a large amount of data, a first step is to compute summary statistics for the data in question.\n",
    "Perhaps the most common summary statistics are the mean and standard deviation, which allow you to summarize the \"typical\" values in a dataset, but other aggregates are useful as well (the sum, product, median, minimum and maximum, quantiles, etc.).\n",
    "\n",
    "通常来说，当我们面对大量数据时，第一步就是计算数据集的概要统计结果。也许最重要的概要统计数据就是平均值和标准差，它们能归纳出数据集典型的数值，但是其他的聚合函数也很用（如求和、乘积、中位值、最小值和最大值、分位数等）。\n",
    "\n",
    "> NumPy has fast built-in aggregation functions for working on arrays; we'll discuss and demonstrate some of them here.\n",
    "\n",
    "NumPy内建有非常快速的函数用于计算数组的统计值；本节中我们会讨论其中常用的部分。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summing the Values in an Array\n",
    "\n",
    "## 在数组中求总和\n",
    "\n",
    "> As a quick example, consider computing the sum of all values in an array.\n",
    "Python itself can do this using the built-in ``sum`` function:\n",
    "\n",
    "首先，我们用一个简单例子来计算数组所有元素值的总和。使用Python內建的`sum`函数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "54.47499738668567"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "L = np.random.random(100)\n",
    "sum(L)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> The syntax is quite similar to that of NumPy's ``sum`` function, and the result is the same in the simplest case:\n",
    "\n",
    "NumPy的`sum`函数的语法也差不多，当然，结果也是一样的。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "54.47499738668566"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.sum(L)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> However, because it executes the operation in compiled code, NumPy's version of the operation is computed much more quickly:\n",
    "\n",
    "然后，因为NumPy的函数是编译执行的，因此它的性能会远远超越Python的內建函数："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "88.3 ms ± 2.84 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
      "564 µs ± 17.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
     ]
    }
   ],
   "source": [
    "big_array = np.random.rand(1000000)\n",
    "%timeit sum(big_array)\n",
    "%timeit np.sum(big_array)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Be careful, though: the ``sum`` function and the ``np.sum`` function are not identical, which can sometimes lead to confusion!\n",
    "In particular, their optional arguments have different meanings, and ``np.sum`` is aware of multiple array dimensions, as we will see in the following section.\n",
    "\n",
    "要注意的是：`sum`内建函数和`np.sum`并不完全相同，这有时会导致混乱。特别的，两个函数的可选参数有着不同的含义，而且`np.sum`函数可以处理多维数组运算，我们将在后续章节看到。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Minimum and Maximum\n",
    "\n",
    "## 最小值和最大值\n",
    "\n",
    "> Similarly, Python has built-in ``min`` and ``max`` functions, used to find the minimum value and maximum value of any given array:\n",
    "\n",
    "类似的，Python也有內建`min`和`max`函数，用来计算数组的最小值和最大值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2.5903288636275335e-06, 0.9999992774771906)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "min(big_array), max(big_array)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> NumPy's corresponding functions have similar syntax, and again operate much more quickly:\n",
    "\n",
    "NumPy对应的函数也有相似的语法，但是执行高效很多："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2.5903288636275335e-06, 0.9999992774771906)"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "np.min(big_array), np.max(big_array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "61 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)\n",
      "744 µs ± 45.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%timeit min(big_array)\n",
    "%timeit np.min(big_array)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> For ``min``, ``max``, ``sum``, and several other NumPy aggregates, a shorter syntax is to use methods of the array object itself:\n",
    "\n",
    "对于`min`，`max`，`sum`和其他NumPy聚合函数来说，也可以通过`ndarray`对象的相应方法进行调用："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2.5903288636275335e-06 0.9999992774771906 499718.9807141967\n"
     ]
    }
   ],
   "source": [
    "print(big_array.min(), big_array.max(), big_array.sum())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Whenever possible, make sure that you are using the NumPy version of these aggregates when operating on NumPy arrays!\n",
    "\n",
    "任何情况下，当你操作NumPy数组时，你都应该使用NumPy的聚合函数来代替Python的內建函数。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multi dimensional aggregates\n",
    "\n",
    "### 多维聚合\n",
    "\n",
    "> One common type of aggregation operation is an aggregate along a row or column.\n",
    "Say you have some data stored in a two-dimensional array:\n",
    "\n",
    "还有一种需求，我们可能需要沿着行或列进行聚合。比方说你有一个二维数组："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[0.27614977 0.75224804 0.69322493 0.55140476]\n",
      " [0.6698524  0.92722784 0.20959198 0.78538042]\n",
      " [0.05078426 0.58621268 0.7614707  0.77247016]]\n"
     ]
    }
   ],
   "source": [
    "M = np.random.random((3, 4))\n",
    "print(M)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> By default, each NumPy aggregation function will return the aggregate over the entire array:\n",
    "\n",
    "默认情况下，NumPy聚合函数都会返回整个数组的聚合结果标量："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "7.03601794886263"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "M.sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Aggregation functions take an additional argument specifying the *axis* along which the aggregate is computed. For example, we can find the minimum value within each column by specifying ``axis=0``:\n",
    "\n",
    "聚合函数可以接收一个额外的参数指定一个*轴*让函数沿着这个方向进行聚合运算。例如，我们可以沿着行的方向计算每列的最小值，通过指定`axis=0`参数即可："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.05078426, 0.58621268, 0.20959198, 0.55140476])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "M.min(axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> The function returns four values, corresponding to the four columns of numbers.\n",
    "\n",
    "这个函数返回四个值，对应着四列。\n",
    "\n",
    "> Similarly, we can find the maximum value within each row:\n",
    "\n",
    "类似的，我们也可以计算每一行的最大值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.75224804, 0.92722784, 0.77247016])"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "M.max(axis=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> The way the axis is specified here can be confusing to users coming from other languages.\n",
    "The ``axis`` keyword specifies the *dimension of the array that will be collapsed*, rather than the dimension that will be returned.\n",
    "So specifying ``axis=0`` means that the first axis will be collapsed: for two-dimensional arrays, this means that values within each column will be aggregated.\n",
    "\n",
    "上述指定axis参数的方式可能会让具有其他编程语言基础的用户感到不适应。这里的`axis`参数指定的是*让数组沿着这个方向进行压缩*，而不是指定返回值的方向。因此指定`axis=0`意味着第一个维度将被压缩：对于一个二维数组来说，就是数组将沿着列的方向进行聚合运算操作。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Other aggregation functions\n",
    "\n",
    "### 其他聚合函数\n",
    "\n",
    "> NumPy provides many other aggregation functions, but we won't discuss them in detail here.\n",
    "Additionally, most aggregates have a ``NaN``-safe counterpart that computes the result while ignoring missing values, which are marked by the special IEEE floating-point ``NaN`` value (for a fuller discussion of missing data, see [Handling Missing Data](03.04-Missing-Values.ipynb)).\n",
    "Some of these ``NaN``-safe functions were not added until NumPy 1.8, so they will not be available in older NumPy versions.\n",
    "\n",
    "NumPy提供了许多其他聚合函数，但是我们不会在这里详细讨论它们。需要说明的是，很多聚合函数都有一个`NaN`安全的版本，可以忽略空缺的数据并计算得到正确的结果。`NaN`即为IEEE标准中浮点数非数值的定义（完整的讨论空缺数据的内容请参见[处理空缺数据](03.04-Missing-Values.ipynb)）。部分`NaN`安全的函数版本是在NumPy 1.8之后加入的，因此在老版本的NumPy中可能无法使用。\n",
    "\n",
    "> The following table provides a list of useful aggregation functions available in NumPy:\n",
    "\n",
    "下表列出了NumPy中有用的聚合函数：\n",
    "\n",
    "| 函数名称      |   NaN安全版本  | 说明                                   |\n",
    "|-------------------|---------------------|-----------------------------------------------|\n",
    "| ``np.sum``        | ``np.nansum``       | 计算总和                       |\n",
    "| ``np.prod``       | ``np.nanprod``      | 计算乘积                   |\n",
    "| ``np.mean``       | ``np.nanmean``      | 计算平均值                      |\n",
    "| ``np.std``        | ``np.nanstd``       | 计算标准差                    |\n",
    "| ``np.var``        | ``np.nanvar``       | 计算方差                              |\n",
    "| ``np.min``        | ``np.nanmin``       | 计算最小值                            |\n",
    "| ``np.max``        | ``np.nanmax``       | 计算最大值                            |\n",
    "| ``np.argmin``     | ``np.nanargmin``    | 寻找最小值的序号                   |\n",
    "| ``np.argmax``     | ``np.nanargmax``    | 寻找最大值的序号                   |\n",
    "| ``np.median``     | ``np.nanmedian``    | 计算中位值                    |\n",
    "| ``np.percentile`` | ``np.nanpercentile``| 计算百分比分布的对应值     |\n",
    "| ``np.any``        | N/A                 | 是否含有True值        |\n",
    "| ``np.all``        | N/A                 | 是否全为True值        |\n",
    "\n",
    "> We will see these aggregates often throughout the rest of the book.\n",
    "\n",
    "我们在本书后续内容中会经常看到这些聚合函数。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example: What is the Average Height of US Presidents?\n",
    "\n",
    "## 例子：美国总统的平均身高？"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Aggregates available in NumPy can be extremely useful for summarizing a set of values.\n",
    "As a simple example, let's consider the heights of all US presidents.\n",
    "This data is available in the file *president_heights.csv*, which is a simple comma-separated list of labels and values:\n",
    "\n",
    "在NumPy中使用聚合统计来对一个数据集进行概要说明是非常有用的。下面我们使用美国总统的身高作为一个简单的例子来说明。这些数据存储在文件*president_heights.csv*里，文件格式就是简单的逗号分隔的文本文件："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "order,name,height(cm)\r\n",
      "1,George Washington,189\r\n",
      "2,John Adams,170\r\n",
      "3,Thomas Jefferson,189\r\n"
     ]
    }
   ],
   "source": [
    "!head -4 data/president_heights.csv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> We'll use the Pandas package, which we'll explore more fully in [Chapter 3](03.00-Introduction-to-Pandas.ipynb), to read the file and extract this information (note that the heights are measured in centimeters).\n",
    "\n",
    "我们会使用Pandas包来读取文件和提取数据（注意身高单位是厘米），Pandas的相关内容我们会在[第三章](03.00-Introduction-to-Pandas.ipynb)中详细介绍。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[189 170 189 163 183 171 185 168 173 183 173 173 175 178 183 193 178 173\n",
      " 174 183 183 168 170 178 182 180 183 178 182 188 175 179 183 193 182 183\n",
      " 177 185 188 188 182 185]\n"
     ]
    }
   ],
   "source": [
    "import pandas as pd\n",
    "data = pd.read_csv('data/president_heights.csv')\n",
    "heights = np.array(data['height(cm)'])\n",
    "print(heights)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Now that we have this data array, we can compute a variety of summary statistics:\n",
    "\n",
    "获得了NumPy数组之后，我们就能计算各种的基本统计数据了："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Mean height:        179.73809523809524\n",
      "Standard deviation: 6.931843442745892\n",
      "Minimum height:     163\n",
      "Maximum height:     193\n"
     ]
    }
   ],
   "source": [
    "print(\"Mean height:       \", heights.mean()) # 身高平均值\n",
    "print(\"Standard deviation:\", heights.std()) # 标准差\n",
    "print(\"Minimum height:    \", heights.min()) # 最小值\n",
    "print(\"Maximum height:    \", heights.max()) # 最大值"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> Note that in each case, the aggregation operation reduced the entire array to a single summarizing value, which gives us information about the distribution of values.\n",
    "We may also wish to compute quantiles:\n",
    "\n",
    "上述结果中，每个聚合函数都将整个数组计算后得到一个标量值，可以让我们初步了解数据的基本分布信息。下面来计算分位值："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "25th percentile:    174.25\n",
      "Median:             182.0\n",
      "75th percentile:    183.0\n"
     ]
    }
   ],
   "source": [
    "print(\"25th percentile:   \", np.percentile(heights, 25)) # 25% 分位值\n",
    "print(\"Median:            \", np.median(heights)) # 50% 分位值 - 中位值\n",
    "print(\"75th percentile:   \", np.percentile(heights, 75)) # 75% 分位值"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> We see that the median height of US presidents is 182 cm, or just shy of six feet.\n",
    "\n",
    "我们看到美国总统身高的中位值是182厘米，也就是6英尺。\n",
    "\n",
    "> Of course, sometimes it's more useful to see a visual representation of this data, which we can accomplish using tools in Matplotlib (we'll discuss Matplotlib more fully in [Chapter 4](04.00-Introduction-To-Matplotlib.ipynb)). For example, this code generates the following chart:\n",
    "\n",
    "当然，有时对数据进行图表展示会更加直观，我们可以通过Matplotlib工具进行（Matplotlib的知识会在[第四章](04.00-Introduction-To-Matplotlib.ipynb)详细介绍）。例如，下述代码产生相应的图表："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn; seaborn.set()  # 设置图表的风格为seaborn"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYUAAAEcCAYAAAAoSqjDAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjAsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+17YcXAAAdOUlEQVR4nO3deZwdZZno8V8nkIAmLLaNGMKq8rAMykW5eh0ccBlHh83tMiIquKNXuTo4o7iBOiByQRBxG0Flu6CDLLIoXHT0goLKdRd9ECUIBjQEkCQKhKTvH291eXLSnZzTOd11uvv3/Xzy6Zw6daqet+qceup936q3BoaHh5EkCWBW0wFIkvqHSUGSVDMpSJJqJgVJUs2kIEmqmRQkSbWNmg5Akysi3gPslJmv72De44AnZuYrJzim5cCTM/O3PVhWXb6I2AG4Ddg4Mx/pwbK3A24GNs/MVRu6vC7W+zjgP4D/Avx7Zh49WeueSBHxNeDCzDx7lPd2oIf7Tp0zKUwxEbEIeH1mXtsy7Yhq2j7r+3xmnjCRsbS9vx/wTeDP1aT7ge8C/yszf9AS07wO1rUfcF5mLlzXfBNZvsz8HbDeWCfAG4F7gM0yc60biyLiW5Rtc2bLtP1o2V4RcTDwQWAn4GHgJ8DrMnPRKMv7IvCKar6Hgf8HvC0zf9XLQmXmC3u5vLGMtn00NpuPNNEWVwf9+cAzgF8B10XEc3u9ooiYric52wM3j5YQOhERTwTOAY4GNgd2BD4FrF7Hx06q9ttC4I/AF8dY9nTd5jOWO3QaiogFwCeAvwOWA6dm5unVe8fR0iQUEa8GPkw5Az4NeB1rnv3PiYhzgBcDvwMOz8ybIuJcYDvg8ohYBXwoM08aK6bqgHYn8IGIeAzwUeBpVQzDwJMy89aI+EfgZGBb4AHgVODTwNeAuVVTE8DOlDPovwEeBA4C/jkiFrJ2k9drq3IPACdn5inVer8I3JmZ76te70d1dj1a+YAv09KkUW3nzwD7APcCH83Mz7Vs592q2NbYdqNtn4h4JvDxqly3AP8zM79bxXgYMBwRbwdeNFbNbB32BG7LzG9Ur5cBX+nkg5n554j438CXWsrVvs0/D/wr8AZgC+AbwJGZeW9EbAKcCbwQmA38GjggM//QegYfEbMp34kjKPv9lNY4ImJz4GPAP1KS2ReAYzNz1UhNGbiR8v29H3hLZn4tIo4HngU8IyJOoyS3t1XLOgyYC9wOvCIzf97JNpnurClMMxExC7ic0jywDfBc4O0R8Q+jzLsb5YzxMODxlLPIbdpmOwi4kPJj/ypwBkBmvopyoDswM+etKyGM4mJgr4h49CjvnQW8KTPnUw4+38zMFZSDyuJqXfMyc3E1/8HARVV854+xvmcDTwKeD7w7Ip63vgA7LN8FlES3AHgZcEJbDWjUbdeuSpJXAqcDg5QD1pURMZiZR1TlOqmKo9uEAPBDYJeIODUinh0RHTeBVfMeBvyoZXL7Nj8KeBGwL2Vb3Ad8spr3cMr3atuqbEcCfxllVW8ADqD0mzyNsj1bnQ08Ajyxmuf5lEQw4ulAAo8FTgLOioiBzHwvcB3w1mr7vbX67N9REvAWwD8BSzvdJtOdNYWp6dKIaO18m0P54QPsDQxl5oeq17+NiM8BLweublvOy4DLM/N6gIj4AOUH3ur6zLyqev9c4O09iH8x5ax9C2BF23srgd0i4ieZeR/lALMuN2TmpdX//xIRo83zwSqx/CwivgAcCozn4FqLiG0pNYQDMvNB4McRcSbwKsqZMnS+7fYHfp2Z51avL4iIo4ADGaPZphuZ+duqFvTPlNrO/Ii4kHKgXD7Gx94ZEW+l1Ai+TzmDH9G+zd9ULetOqGsTv4uIV1H25yCl9vZTSv/EaA4BTsvMO6plfATYr/r/4ygnBVtk5l+AFRFxKqWm+Nnq87e31NLOppzsPA64e5R1raQ0Z+4CfD8zfzlGTDOSSWFqWqMJoaX6DKX9eUFE3N8y/2zK2VK7BcAdIy+qpoL2M6bWH9WfgU0iYqMNvCJkG2CYUs1v91LgfcCJEfFT4N2ZecM6lnXHOt4bbZ7bgT06DXQdFgD3ZuaytmU/reV1p9tuQfXZVrezdq1tLI8AG7dN25hy8AMgM2+kHHiJiL0pzUHvBY4ZY5knjzSrjaJ9m28PXBIRrX0UqygH5XMptYQLI2IL4DzgvZm5sm0Za3wXWXN7bF+V566WpD+rbf56W1ffYxjjooDM/GZEnEGpzWwXEZcA78zMB0Yv7sxiUph+7qC0Hz+pg3nvAupfWURsSjmr69R4h9h9MfDD6ux9DdVVSQdHxMbAWylnttuuY12dxLAtpYMbSj/BSNPTCuBRLfNt3cWyFwOPiYj5LYlhO+D3HcQz2rK2b5u2HfD1Dj//O2CHtmk7snaiAco2joiLKc1z49G+Xe4AXpuZ3xlj/g8CH6wuM72K0sxzVts8d1H204jt2pb/EPDYcZ6MrLUfqz620yNiK8p37F+A949j2dOOSWH6+T7wQES8i9JG/TCwK7Bp62WglYuAG6tOzpsoP96BLtb1B8oljusVEQOUs8HXV/8OGmWeOcB/B67IzD9FxAOUM86RdQ1GxOaZ+acuYgR4f0S8gXKgfA0w0gn9Y+DoiPg3ShNce/POmOXLzDsi4rvARyLinZT26de1LLsbVwGfiIhXUA5QL6V0Ul/R4ee/BJwbEV8BfkDpP3kHpeOaiNiH8h24LDP/GBG7ULb/WvcHjNNngOMj4vDMvD0ihoBnZuZlEfFsyuW0N1M6kFfy133a6svAURFxBSVZv3vkjcy8KyKuAU6JiPdTLp7YEViYmd/uIL419mNVU5pFaXJdQWkim7T7TvqdHc3TTHVT1YFUV5xQfpBnUjr72uf9BeVKjAspZ2rLKJcfPtTh6j4CvC8i7q8OjKNZUF0xtJxywNoD2C8zrxlj/lcBi6qEcCTVQba6Rv4CSh/J/dWVP536NnArpa3/5JZ1n0vpkF8EXEN1hU0X5TuUcoa+GLiEcjXM/+kiLgAycymlk/VoSofnv1L6Ku7p8PNXUw6iXwD+REkyZwP/Xs1yPyUJ/KzaF1+v4u3m4oB1+TilI/2aiFhGuQro6dV7W1NOPh4AfknZF+eNsozPUfq8fkI5WF/c9v6rKYn7Zko/00WUiyM6je9lEXFfRJwObFat7z5KbWop5Yo3AQM+ZEcjqitN7qdcHnpb0/FImnw2H81wEXEg5Qx6gHK29DPKmbOkGcjmIx1Maf5YTGmLfvl475yVNPXZfCRJqllTkCTVpnqfwlzKHbx34SVlktSp2ZSrt35A29WGUz0p7M3od+pKktbvWcD1rROmelK4C+C++1awenXzfSODg/NYunSsoWSmFsvSf6ZLOcCyNG3WrAG23PLRUB1DW031pLAKYPXq4b5ICkDfxNELlqX/TJdygGXpE2s1u9vRLEmqmRQkSTWTgiSpZlKQJNUmraM5Ik6mDAm8A7BHZv48IgYpI1U+gXKt7K2URzEumay4JEl/NZk1hUspz0VtffDHMOXZs5GZTwZ+A5w4iTFJklpMWk2h5TnArdPuBb7VMtuNwJsnKyZJ0pr65j6FiJhFSQhf7fazg4OjPoq1EUND85sOoWcsS//ptBwPr1zFnI1nT3A0G7be6bJPYHqVpW+SAvAJytO5zuj2g0uXLu+Lm0eGhuazZMmy9c84BViW/tNNOYaG5nPg0ZdNcERru/yUgzuKcbrsE5iaZZk1a2DMk+m+SApVJ/STgAMzc3XT8UjSTNV4UoiI44GnAvtnZqfPBpYkTYDJvCT1dOAllAd5XxsRS4FDgPcAtwDfrTqhb8vMF09WXJKkv5rMq4+OAo4a5a2ByYpBkrRu3tEsSaqZFCRJNZOCJKlmUpAk1UwKkqSaSUGSVDMpSJJqJgVJUs2kIEmqmRQkSTWTgiSpZlKQJNVMCpKkmklBklQzKUiSaiYFSVLNpCBJqpkUJEk1k4IkqWZSkCTVTAqSpJpJQZJUMylIkmomBUlSzaQgSaptNBkriYiTgZcCOwB7ZObPq+k7A2cDg8BS4NWZ+evJiEmStLbJqilcCvwdcHvb9M8An8zMnYFPAp+dpHgkSaOYlKSQmddn5h2t0yJiK2Av4IJq0gXAXhExNBkxSZLW1mSfwrbA7zNzFUD1d3E1XZLUgEnpU5hog4Pzmg6hNjQ0v+kQesay9J+pUI5OY5wKZenUdCpLk0nhDmCbiJidmasiYjawoJrelaVLl7N69XDPA+zW0NB8lixZ1nQYPWFZ+k835WjyINVJjNNln8DULMusWQNjnkw31nyUmX8EfgwcWk06FPhRZi5pKiZJmukmJSlExOkRcSewELg2In5RvXUk8LaIuAV4W/VaktSQSWk+ysyjgKNGmf4r4OmTEYMkaf28o1mSVDMpSJJqJgVJUs2kIEmqmRQkSTWTgiSpZlKQJNVMCpKkmklBklQzKUiSaiYFSVLNpCBJqpkUJEk1k4IkqWZSkCTVTAqSpJpJQZJUMylIkmomBUlSzaQgSaqZFCRJNZOCJKlmUpAk1UwKkqSaSUGSVDMpSJJqGzUdAEBEHAB8GBigJKrjMvPiZqOSpJmn8ZpCRAwA5wKvysw9gVcCZ0dE47FJ0kzTLwfe1cDm1f+3AO7KzNUNxiNJM1LjzUeZORwRhwCXRcQKYD6wfzfLGBycNyGxjcfQ0PymQ+gZy9J/pkI5Oo1xKpSlU9OpLI0nhYjYCDgGODgzvxMRfwt8KSJ2y8zlnSxj6dLlrF49PKFxdmJoaD5LlixrOoyesCz9p5tyNHmQ6iTG6bJPYGqWZdasgTFPpvuh+WhPYEFmfgeg+rsC2LXRqCRpBuqHpHAnsDAiAiAidgW2Bn7TaFSSNAM13nyUmXdHxJuBiyJipHP5NZl5b5NxSdJM1HhSAMjM84Hzm45Dkma6fmg+kiT1CZOCJKnWcVLwDmNJmv46OtBHxGxgRUTMneB4JEkN6igpZOYq4BZgcGLDkSQ1qZurj84HroiIj1PuLahvIc7Mb/Y6MEnS5OsmKby5+ntc2/RhYKeeRCNJalTHSSEzd5zIQCRJzevq5rWI2Bh4BmWsoi9FxKMBMnPFRAQnSZpc3VySugels/lzwFnV5H2Bz09AXJKkBnRz78GngQ9k5i7Aymrat4F9eh6VJKkR3SSF3YHzqv8PQ91stGmvg5IkNaObpLAIeGrrhIj4r8CtvQxIktScbjqa3w9cGRGfAeZExDHAkcAbJiQySdKk67imkJlXAC8Ehih9CdsDL8nMayYoNknSJOvqktTM/CHwlgmKRZLUsI6TQkTMAd4HHAosABYDFwLHZ+aDExOeJGkydVNT+DQQwFHA7ZTmo2OAbYDX9j40SdJk6yYpvAh4QmbeX72+OSK+R7n6yKQgSdNAN5ek3g08qm3apsBdvQtHktSkddYUIuI5LS/PBb4eEZ+gDJ29LfA/gHMmLjxJ0mRaX/PRWaNMe0/b6zcBH+1NOJKkJq0zKThctiTNLN30KUiSprlu7lN4CnAqsCcwr5o8AAxn5pwJiE2SNMm6uST1AuArlPsU/tLLICJiE0rCeR7wIHBDZr6xl+uQJK1fN0lha8rzFIYnII6TKMlg58wcjojHTcA6JEnr0U1SOBt4BXB+LwOIiHnAq4GFIwknM//Qy3VIkjrTTVI4EbghIt4DrHHQzsznjP6RjjwBWAocGxHPBpYD78vM6zdgmZKkcegmKVwE3AZcQm/7FDYCdgJ+lJn/EhFPBy6PiCdm5gOdLGBwcN76Z5okQ0Pzmw6hZyzLhnl45SrmbDy7p8ucCvuk0xinQlk6NZ3K0k1S2BMYzMyHexzD7cAjlI5sMvN7EXEPsDNwUycLWLp0OatXT0RXR3eGhuazZMmypsPoCcvSm/UeePRlk75egMtPObiR9QIdbWu/X82aNWtgzJPpbu5TuA7YrScRtcjMe4D/BP4eICJ2BrbCx3xK0qTrpqZwG3BNRFzC2n0KH9jAOI4EPh8RpwArgVe1jMYqSZok3SSFRwFXAnMog+H1TGb+Ftivl8uUJHWv46SQma+ZyEAkSc3rZpiLncZ6rzrTlyRNcd00H90KDFPGOxoxcslPb6+7kyQ1opvmozWuVIqIrYFjKVclSZKmgXEPnZ2ZdwNvBz7Su3AkSU3a0OcpBGs/t1mSNEV109F8HX/tQwB4NOVmtg/3OihJUjO66Wg+s+31CuAnmfnrHsYjSWpQN81HFwJzgacDzwUOAt4fEedMRGCSpMnXTU3hi8BTgMtpG+ZCkjQ9dJMUXgDs6JhEkjR9ddN89DtK85EkaZrqpqZwDnBZRHyctUdJ/WZPo9K0M3+zTdlkbjdft955eOWqRtaryeP3q3e62Ypvrf6e0DZ9mPLkNGlMm8zdaEY+cEaTw+9X73QzzMWOExmIJKl5G3pHsyRpGjEpSJJqJgVJUs2kIEmqmRQkSTWTgiSpZlKQJNVMCpKkmklBklQzKUiSaiYFSVKtr5JCRBwbEcMR8TdNxyJJM1HfJIWI2At4BuW5DZKkBvRFUoiIucAngbdQhuKWJDWgmadSrO1DwHmZeVtEdP3hwcF5vY9onIaG5jcdQs9YFo1Hp9t6Ou2T6VSWxpNCRPw3YG/g3eNdxtKly1m9uvkKxtDQfJYsWdZ0GD3R67I0/aNpYr80XeamdLKt/X41a9asgTFPpvuh+WhfYBfgtohYBCwEro6I5zcZlCTNRI3XFDLzRODEkddVYjggM3/eVEySNFP1Q01BktQnGq8ptMvMHZqOQZJmKmsKkqSaSUGSVDMpSJJqJgVJUs2kIEmqmRQkSTWTgiSpZlKQJNVMCpKkmklBklQzKUiSaiYFSVKt7wbE08Sav9mmbDK3s93e9INLeuXhlaumTVn6XTfb2n3Sn0wKM8wmczfiwKMvm/T1Xn7KwZO+zhFzNp4948rcFLf11GfzkSSpZlKQJNVMCpKkmklBklQzKUiSaiYFSVLNpCBJqpkUJEk1k4IkqWZSkCTVTAqSpFrjYx9FxCBwLvAE4CHgVuBNmbmk0cAkaQbqh5rCMHBSZkZmPhn4DXBiwzFJ0ozUeE0hM+8FvtUy6Ubgzc1EI0kzWz/UFGoRMYuSEL7adCySNBM1XlNo8wlgOXBGNx8aHJw3MdGMQycPDnl45SrmbDx7EqKRNNGaeojTRB1H+iYpRMTJwJOAAzNzdTefXbp0OatXD09MYF0YGprPkiXLOpqviQeRgA8jkXqtyQcLdXK8Gc2sWQNjnkz3RVKIiOOBpwL7Z+ZDTccjSTNV40khInYH3gPcAnw3IgBuy8wXNxqYJM1AjSeFzPwFMNB0HJKkPrv6SJLULJOCJKlmUpAk1UwKkqSaSUGSVDMpSJJqJgVJUs2kIEmqmRQkSTWTgiSpZlKQJNVMCpKkWuMD4jVp/mabssnc3m6CJh62IUm9MqOTwiZzN2rs4RiS1I9sPpIk1UwKkqSaSUGSVDMpSJJqJgVJUs2kIEmqmRQkSTWTgiSpZlKQJNVMCpKkmklBklQzKUiSan0xIF5E7AycDQwCS4FXZ+avm41KkmaefqkpfAb4ZGbuDHwS+GzD8UjSjNR4TSEitgL2Av6+mnQBcEZEDGXmkvV8fDbArFkD417/VltuOu7Pboim1tvkui3zzFj3TFtvk+se77Gv5XOz298bGB4e3oCQNlxEPBU4JzN3b5l2M/DKzPzhej6+D3DdRMYnSdPYs4DrWyc0XlPYQD+gFOouYFXDsUjSVDEbeDzlGLqGfkgKdwDbRMTszFwVEbOBBdX09XmItiwnSerIb0ab2HhHc2b+EfgxcGg16VDgRx30J0iSeqzxPgWAiNiFcknqlsB9lEtSs9moJGnm6YukIEnqD403H0mS+odJQZJUMylIkmomBUlSrR/uU5gSIuJk4KXADsAemfnzavomwKnA84AHgRsy843Ve4uqaQ9Wi3lXZl49qYGPYrSyRMQOwKUts20BbJaZj6k+03eDFo6zHIuYIvukmn4A8GFggHISd1xmXly913f7BMZdlkVMrf2yP6UsGwP3Akdk5m3Ve325XzplUujcpcDHWXtYjZMoX+SdM3M4Ih7X9v7LRr5IfWStsmTmImDPkdcRcRprfj9GBi08LyJeSRm08DmTEu3YxlMOmCL7JCIGgHOBZ1UJ78nAdyLi0sxcTX/uExhfWWDq7JctKQf9Z2bmLdW2/zTwgmqWft0vHbH5qEOZeX1mrnGXdUTMA14NvD8zh6v5/tBEfN0YrSytImIOcBjw+er1yKCFF1SzXADsFRFDEx3runRbjn62jrKsBjav/r8FcFdmru7XfQLdl2XyIuveGGV5IvCHzLylen0V8A8R8dh+3i+dMilsmCdQqofHRsRNEfGtiNinbZ7zI+KnEfGpiNiigRjH4yDg9y0DEm5bvV4FUP1dXE3vZ+3lGDEl9kl1onEIcFlE3E45az28entK7ZP1lGXElNgvwC3A1hGxd/X6sOrvdkyx/TIak8KG2QjYiTIsx9OAdwEXR8Rm1fvPysynAHtT2lHPaCbMrr2WKXB23YHRyjFl9klEbAQcAxycmdsDBwJfqmqoU0oHZZky+yUz/wT8E3BqRNwEbAXcD6xsNLAeMSlsmNuBR6iqipn5PeAeYOfq9R3V34eATwF/20yYnYuIBcC+wPktk+tBC6t5uhm0sBFjlGOq7ZM9gQWZ+R2A6u8KYFem3j5ZV1mm2n4hM6/NzH2qk8EzgE2B3zL19staTAobIDPvAf6T6gFB1VUHWwG3RsSjI2LzavoA8HLKwH/97gjgysxcOjJhig5aeARt5ZiC++ROYGFEBEBE7ApsDfxmCu6TMcsyBfcLEbF19XcWcALwmcxcMQX3y1oc+6hDEXE68BLKF/keYGlm7h4RO1GaKAYp1cf3ZubXqulfoYxbPhu4GTgqM+9qpAAtxipL9d4tlDi/3vaZvhu0sNtyTMV9EhGHAe+mdNICHJuZl1af6bt9At2XZYrulzMptZk5wDXAOzLzweozfblfOmVSkCTVbD6SJNVMCpKkmklBklQzKUiSaiYFSVLNpKBpJyIWRcTzxvnZX0TEfhO9npZlPD8iLl3/nF0v9+KIeMH655TW5CipUouR+xw2VJVYzsvMheuZ9QTgrb1YZ5sTKSN3fn19M0qtrClIDakGVNs8M2/s9bIz8/vAZhHxtF4vW9ObNQVNV3tGxMeA7Slny4e33HF6APBvlAen3AwcmZk/rd5bBLw+M6+NiE0pY+MfBNwNfIFyp+3Cda2Hclfu14C5EbG8mm/nzFzcFuMLgW+3ToiI3YHTgKdS7pD/eGaeEBHHAbsDDwEHA4soD395KfCOavrrMvOalsV9C9gfuKnDbSZZU9C0dQjloSc7Ak+mjIVEROxFGZbkTZShST4LfDUi5o6yjGMpiWMnyvhWr+xkPZm5gnLAX5yZ86p/7QkBYA+gHv4gIuYD11KSywLKuP3faJn/QMqDarYEfgRcTfkNbwN8qCpLq18CTxllvdKYTAqark7PzMWZeS9wOX99GtsbgM9m5vcyc1Vmnk05y37GKMs4BDghM+/LzDuB07tYTye2AJa1vD4AuDszT8nMBzNzWTXy7ojrMvPqzHwE+A9gCDgxM1cCFwI7tD2HYFm1DqljNh9purq75f9/ppx5Q2nmOTwi3tby/pyW91u1D3k82vDHY62nE/cB81tebwv8Zh3ztz7V7y/APSMPc6leA8yjjO1Ptez7kbpgUtBMcwdwfGYe38G8dwELKf0O0N3TszoZafKnVM/eaInt0DHmHY9dgZ/0cHmaAUwKmmk+B1wSEdcC3wceBewH/N/MXNY275eBYyLiB9V83Vw6+gdgMCI2r57UNZqrKM0+I64APhYRb6dcTjoH2K2tCakb+zJ6P4g0JvsUNKNk5k2UfoUzKM03t1J1Qo/iQ5SHw9xG6QC+iNL/0Ml6fkV5It9vI+L+6klw7fP8EPhTRDy9er2M0qF9IKVZ6tfAszstW6vqctcV1aWpUsd8noLUoYh4M/DyzNy3h8t8PvCWzHxRr5ZZLfcrwFmZeVUvl6vpz6QgjSEiHk+5HPUG4EnAlcAZmXlao4FJE8g+BWlscyjX/u9IuYrnQspD5aVpy5qCJKlmR7MkqWZSkCTVTAqSpJpJQZJUMylIkmomBUlS7f8D2WssMMzSmegAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.hist(heights)\n",
    "plt.title('Height Distribution of US Presidents')\n",
    "plt.xlabel('height (cm)')\n",
    "plt.ylabel('number');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> These aggregates are some of the fundamental pieces of exploratory data analysis that we'll explore in more depth in later chapters of the book.\n",
    "\n",
    "这些聚合数据提供了我们对于数据集最基本的理解，我们会在本书后续章节更加深入的讨论它们。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [使用Numpy计算：通用函数](02.03-Computation-on-arrays-ufuncs.ipynb) | [目录](Index.ipynb) | [在数组上计算：广播](02.05-Computation-on-arrays-broadcasting.ipynb) >\n",
    "\n",
    "<a href=\"https://colab.research.google.com/github/wangyingsm/Python-Data-Science-Handbook/blob/master/notebooks/02.04-Computation-on-arrays-aggregates.ipynb\"><img align=\"left\" src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open in Colab\" title=\"Open and Execute in Google Colaboratory\"></a>\n"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
